Reproducible Research II

Adam M. Wilson

September 2015

Overview

Questions from last week?

  • Got Git?

Homework Review

Today

Outline

  • More Git
  • More Markdown
  • Introduction to ggplot2
  • Introduction to spatial data in R

Working with Git and GitHub

Git Has Integrity

Everything is checksummed before it is stored and is then referred to by that checksum.

It’s impossible to change the contents of any file or directory without Git knowing. You can’t lose information in transit or get file corruption without Git being able to detect it.

A 40-character hexadecimal SHA-1 hash:

24b9da6552252987aa493b52f8696cd6d3b00373

Checksum

A way of reducing digital information to a (practically) unique ID.

Git doesn’t care about filenames, extensions, etc. It’s the information that matters…
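
A minimal sketch of this idea, assuming git is installed. The file names and contents are made up for illustration. `git hash-object` prints the checksum Git would use for a file: identical contents give the identical ID regardless of the file name, and a one-character change gives a different ID.

```shell
# Work in a throwaway directory (hypothetical example files)
dir=$(mktemp -d)
cd "$dir"

printf 'species,count\noak,12\n' > data.csv
cp data.csv renamed_copy.txt                      # same contents, different name
printf 'species,count\noak,13\n' > edited.csv     # one character changed

# Ask Git for the checksum of each file's contents
hash_original=$(git hash-object data.csv)
hash_copy=$(git hash-object renamed_copy.txt)
hash_edited=$(git hash-object edited.csv)

echo "original: $hash_original"
echo "copy:     $hash_copy"
echo "edited:   $hash_edited"
```

The first two lines printed should match each other, and the third should differ.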

The 3 states of files

modified, staged, committed


The important stuff is hidden in the .git folder.
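
The three states can be watched from the command line. This is a sketch only, assuming git is installed; the repository, file name, and user details are hypothetical. `git status --porcelain` prints a two-column code per file: the first column describes the staging area, the second the working directory.

```shell
# Throwaway repository with one committed file (hypothetical names)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# 1. Modified: the file changed on disk but the change is not staged
echo "revision" >> notes.txt
state_modified=$(git status --porcelain)

# 2. Staged: the change is marked for inclusion in the next commit
git add notes.txt
state_staged=$(git status --porcelain)

# 3. Committed: the change is stored in the .git folder; nothing is pending
git commit -q -m "Revise notes"
state_committed=$(git status --porcelain)

printf 'modified:  [%s]\nstaged:    [%s]\ncommitted: [%s]\n' \
  "$state_modified" "$state_staged" "$state_committed"
```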

Commit to GitHub from within RStudio

Steps:

  1. Stage
  2. Commit (with a message)
  3. Push
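
For reference, the same three steps can be run from the command line. This is a hedged sketch rather than what RStudio does internally: a local bare repository stands in for GitHub so the example runs offline, and all paths, names, and the commit message are assumptions.

```shell
# A local bare repository plays the role of GitHub (hypothetical stand-in)
work=$(mktemp -d)
git init -q --bare "$work/remote.git"

git init -q "$work/project"
cd "$work/project"
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Example User"
git config user.email "user@example.com"
git remote add origin "$work/remote.git"

echo "# Demo" > README.md

git add README.md                  # 1. Stage
git commit -q -m "Add README"      # 2. Commit (with a message)
git push -q origin master          # 3. Push

# The remote now holds the same commit as the local repository
local_head=$(git rev-parse HEAD)
remote_head=$(git --git-dir="$work/remote.git" rev-parse refs/heads/master)
echo "local:  $local_head"
echo "remote: $remote_head"
```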

Staging


Select which files you want to commit.

Committing


Add a commit message and click commit.

Syncing (push)


Click Push to send your commits to GitHub.

GitHub


Files are updated/stored on GitHub

Git File Lifecycle


Git command line from RStudio

RStudio’s Git interface offers only a limited subset of Git’s functionality; the command line provides the rest.


Git help

$ git help <verb>
$ git <verb> --help
$ man git-<verb>

For example, you can get the man page for the config command by running git help config.

Git status


Similar to the information in the Git tab in RStudio.

Git config

git config --list shows you all the Git configuration settings, including:

  • user.email
  • remote.origin.url (e.g. to connect to GitHub)
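
A short sketch of reading and writing these two settings, assuming git is installed. The email address and GitHub URL are placeholders, not real accounts, and the settings are written only to a throwaway repository.

```shell
# Throwaway repository so no global settings are touched
repo=$(mktemp -d)
cd "$repo"
git init -q

# Set a per-repository email and register a remote named origin (placeholder values)
git config user.email "student@example.edu"
git remote add origin "https://github.com/example-user/example-repo.git"

# Read individual settings back
email=$(git config --get user.email)
origin_url=$(git config --get remote.origin.url)
echo "user.email        = $email"
echo "remote.origin.url = $origin_url"

# List every setting stored in this repository
git config --list --local
```

Adding `--global` to `git config user.email` would instead store the value for every repository belonging to the current user.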

Branching

Branches are used to develop features in isolation from each other.

Default: master branch. Use other branches for development/collaboration and merge them back upon completion.

Basic Branching

$ git checkout -b devel   # create a new branch and switch to it

$ git checkout master     # switch back to master
$ git merge devel         # merge in changes from the devel branch
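
The commands above can be tried end to end in a scratch repository. This is a sketch under stated assumptions: git is installed, and the file name, contents, and commit messages are invented for illustration.

```shell
# Scratch repository with one commit on master (hypothetical contents)
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Example User"
git config user.email "user@example.com"
echo "first draft" > analysis.txt
git add analysis.txt
git commit -q -m "Initial commit"

# Develop a change on a separate branch
git checkout -q -b devel
echo "new feature" >> analysis.txt
git commit -q -am "Add feature on devel"

# Back on master the file is unchanged until the merge
git checkout -q master
before_merge=$(cat analysis.txt)
git merge -q devel
after_merge=$(cat analysis.txt)

printf 'before merge:\n%s\n\nafter merge:\n%s\n' "$before_merge" "$after_merge"
```

Before the merge, master still holds only the first line; afterwards it also contains the line committed on devel.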

But we won’t do much with branching in this course…

Git can do far more!

Check out the (free) book Pro Git.


RMarkdown

RMarkdown: new file


RMarkdown: syntax


RMarkdown: output


RMarkdown: code


RMarkdown: chunks

Option      Default     Effect
eval        TRUE        Evaluate the code and include the results
echo        TRUE        Display the code along with its results
warning     TRUE        Display warnings
error       FALSE       Display errors
message     TRUE        Display messages
tidy        FALSE       Reformat code to make it "tidy"
results     "markup"    One of "markup", "asis", "hold", "hide"
cache       FALSE       Cache results for future renders
comment     "##"        Comment character to preface results
fig.width   7           Width in inches for plots
fig.height  7           Height in inches for plots

Chunk examples

R Code Chunks: Displaying Plots


Global chunk options

Set chunk options once to apply them throughout a document.

RMarkdown: render


Visualize .md on GitHub

Update the YAML header to keep the markdown file

From this:

title: "Untitled"
author: "Adam M. Wilson"
date: "September 21, 2015"
output: html_document

To this:

title: "Demo"
author: "Adam M. Wilson"
date: "September 21, 2015"
output:
  html_document:
    keep_md: true

Then click Knit HTML to generate the output.

Visualize example


Explore markdown functions

  1. Use File -> New File -> R Markdown to create a new markdown file.
  2. Use the Cheatsheet to add sections (# and ##) and some example narrative.
  3. Stage, Commit, Push!
  4. Explore the markdown file on your GitHub website.

Take 15 minutes & ask questions!

Colophon

Licensing:

  • Presentation: CC-BY-3.0
  • Source code: MIT

References

See Rmd file for full references and sources